PPO-Exp: Keeping Fixed-Wing UAV Formation with Deep Reinforcement Learning

Abstract

Flocking for fixed-wing Unmanned Aerial Vehicles (UAVs) is an extremely complex challenge because of the fixed-wing UAV's control problem and the difficulty of coordinating the system. Recently, flocking approaches based on reinforcement learning have attracted attention. However, current methods require each UAV to make its decisions in a decentralized way, which increases the computational cost of the whole system. This paper studies a low-cost formation system consisting of one leader (equipped with an intelligence chip) and five followers (without a chip), and proposes a centralized, collision-free formation-keeping method. The communication in the process is considered, and the protocol is designed by minimizing the communication cost. In addition, an analysis of the Proximal Policy Optimization (PPO) algorithm is provided; it derives an estimation error bound and reveals the relationship between the bound and exploration. To encourage the agent to balance exploration against this bound, a version of PPO named PPO-Exploration (PPO-Exp) is proposed. It can adjust the clip constraint parameter to make the mechanism more flexible. The experimental results show that PPO-Exp performs better than the other algorithms on these tasks.
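As a rough illustration of what "adjusting the clip constraint parameter" can mean, the sketch below computes the standard PPO clipped surrogate objective and pairs it with a simple rule that widens or narrows the clip range. The adjustment rule, the function names, and the parameters `exploration_signal`, `target`, `step`, `lo`, and `hi` are assumptions made for this example only; the abstract does not state how PPO-Exp updates the parameter, so this should not be read as the paper's method.

```python
import numpy as np


def clipped_surrogate(ratio, advantage, clip_eps):
    """Standard PPO clipped surrogate objective (to be maximized), averaged over samples.

    ratio     : probability ratio pi_new(a|s) / pi_old(a|s) for each sample
    advantage : advantage estimate for each sample
    clip_eps  : clip constraint parameter (epsilon)
    """
    ratio = np.asarray(ratio, dtype=float)
    advantage = np.asarray(advantage, dtype=float)
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    # The element-wise minimum removes any incentive to push the ratio
    # outside the interval [1 - eps, 1 + eps].
    return float(np.mean(np.minimum(unclipped, clipped)))


def adjust_clip_eps(clip_eps, exploration_signal, target, step=0.01, lo=0.05, hi=0.3):
    """Hypothetical adjustment rule (NOT taken from the paper).

    Widens the clip range when a scalar exploration measure (for example,
    policy entropy) is below a target, narrows it otherwise, and keeps the
    result inside [lo, hi].
    """
    if exploration_signal < target:
        clip_eps += step
    else:
        clip_eps -= step
    return float(min(max(clip_eps, lo), hi))
```

A larger `clip_eps` lets each update move the policy further from the old one, which is why a rule of this kind trades exploration against the size of the policy change.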

Similar resources

Powerline Perching with a Fixed-Wing UAV

Small and micro UAVs have enabled a number of new mission capabilities, including navigating in and around buildings and performing perch-and-stare surveillance. However, one of the primary limitations of these small vehicles is endurance, simply because they cannot carry sufficient power for long missions. Recent advances in fixed-wing perching have made it possible to consider a new solution ...

Experiments in Fixed-Wing UAV Perching

High-precision maneuvers at high angles-of-attack are not properly addressed by even the most advanced aircraft control systems. Here we present our control design procedure and indoor experimental results with a small fixed-wing autonomous glider which is capable of executing an aggressive high angle-of-attack maneuver in order to land on a perch. We first acquire a surprisingly accurate aircr...

End-to-End Deep Reinforcement Learning for Lane Keeping Assist

Reinforcement learning is considered to be a strong AI paradigm which can be used to teach machines through interaction with the environment and learning from their mistakes, but it has not yet been successfully used for automotive applications. There has recently been a revival of interest in the topic, however, driven by the ability of deep learning algorithms to learn good representations of...

Deep Reinforcement Learning with POMDPs

Recent work has shown that Deep Q-Networks (DQNs) are capable of learning human-level control policies on a variety of different Atari 2600 games [1]. Other work has looked at treating the Atari problem as a partially observable Markov decision process (POMDP) by adding imperfect state information through image flickering [2]. However, these approaches leverage a convolutional network structure...

Reinforcement Learning with Deep Architectures

There is both theoretical and empirical evidence that deep architectures may be more appropriate than shallow architectures for learning functions which exhibit hierarchical structure, and which can represent high level abstractions. An important development in machine learning research in the past few years has been a collection of algorithms that can train various deep architectures effective...

Journal

Journal title: Drones

Year: 2022

ISSN: 2504-446X

DOI: https://doi.org/10.3390/drones7010028